Feature selection algorithm for high-dimensional data with maximum correlation and maximum difference
Shengjie MENG, Wanjun YU, Ying CHEN
Journal of Computer Applications    2024, 44 (3): 767-771.   DOI: 10.11772/j.issn.1001-9081.2023030365
Abstract

To address the problems of redundant information and excessive dimensionality in high-dimensional data, an information-theoretic Maximum Correlation and Maximum Difference feature selection algorithm (MCD) was proposed. Firstly, Mutual Information (MI) was used to measure the correlation between each feature and the labels; the features were ranked accordingly, and those with the largest MI were selected into the feature subset. Then, the information distance was introduced to measure the redundancy and difference between two features, and an evaluation criterion was designed to score each feature so that both the correlation between features and labels and the difference among features were maximized. Finally, a forward search strategy combined with the evaluation criterion was used to reduce the attributes and optimize the feature subset. MCD was compared with 5 classical algorithms, including mRMR (minimal-Redundancy-Maximal-Relevance criterion) and RReliefF, on 6 datasets using 2 different classifiers, and its validity was verified by classification accuracy. Under the Support Vector Machine (SVM) classifier, the average classification accuracy of MCD was 5.67 to 23.80 percentage points higher than that of the compared algorithms; under the K-Nearest Neighbor (KNN) classifier, it was 2.69 to 25.18 percentage points higher. The results show that, in the vast majority of cases, MCD can effectively remove redundant features and significantly improve classification accuracy.
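
The three steps described in the abstract (MI ranking, information distance, forward search) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the exact evaluation criterion, so the score used here (MI with the labels plus the mean information distance to the already selected features) is an assumption, as are the function names, the use of the variation of information H(X|Y) + H(Y|X) as the information distance, and the requirement that all features are already discretized.

```python
import numpy as np


def entropy(x):
    """Shannon entropy (in nats) of a discrete 1-D array."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def joint_entropy(x, y):
    """Entropy of the joint distribution of two discrete 1-D arrays."""
    pairs = np.stack([x, y], axis=1)
    _, counts = np.unique(pairs, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - joint_entropy(x, y)


def information_distance(x, y):
    """Variation of information: H(X|Y) + H(Y|X) = 2 H(X,Y) - H(X) - H(Y)."""
    return 2.0 * joint_entropy(x, y) - entropy(x) - entropy(y)


def select_features(X, y, k):
    """Greedy forward selection of k feature indices.

    Assumed criterion (not taken from the paper): relevance to the labels
    plus the mean information distance to the features selected so far.
    """
    n_features = X.shape[1]
    relevance = np.array(
        [mutual_information(X[:, j], y) for j in range(n_features)]
    )
    # Start from the feature with the largest MI with the labels.
    selected = [int(np.argmax(relevance))]
    remaining = set(range(n_features)) - set(selected)
    while len(selected) < k and remaining:
        best_j, best_score = None, -np.inf
        for j in sorted(remaining):
            difference = np.mean(
                [information_distance(X[:, j], X[:, s]) for s in selected]
            )
            score = relevance[j] + difference
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

With such a criterion, an exact duplicate of an already selected feature has information distance zero to it, so a candidate that is equally relevant but carries different information is preferred. Continuous features would have to be discretized before these count-based estimates apply.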
